ABSTRACT

It is shown that every unambiguous grammar which does not
generate the empty string is covered by a $\Lambda$-free grammar. Every unambiguous
grammar which does generate the empty string is covered by a grammar which is
partitioned into a $\Lambda$-free portion and a portion which generates only the empty
string. Finally, every unambiguous grammar is covered by such a partitioned
grammar in operator form.



1.   INTRODUCTION 


The presence of $\Lambda$-rules in a grammar often poses tricky problems
for practical translators. Thus, practitioners usually restrict themselves
to considering $\Lambda$-free grammars, arguing that the restriction imposes no
significant hardships. In this paper, we present supporting evidence for
that position by showing that every unambiguous grammar is completely
covered by a grammar in which the $\Lambda$-generating portion (if present) is
completely isolated from the remaining portion. That remaining portion is
then shown to be covered by a $\Lambda$-free grammar. As a followup, it is then
shown that such an unambiguous "$\Lambda$-isolated" grammar is covered by an
operator grammar. This strengthens a result of Gray and Harrison [3].

Definition. A context-free (CF) grammar is a 4-tuple $G = (V, \Sigma, P, S)$ where:

(a) $V$ is a finite non-empty set of symbols (vocabulary);

(b) $\Sigma \subseteq V$ (terminal vocabulary);

(c) $N = V - \Sigma$ (non-terminal vocabulary);

(d) $S \in N$ (goal symbol), and;

(e) $P$ is a finite subset of $N \times V^*$ (productions).

We will denote an element $(u, v)$ of $P$ by $u \to v$, and we will often
ascribe indices to productions: $\pi_i = u \to v$. We also employ the usual binary
relation $\Rightarrow \subseteq V^* \times V^*$, writing $u \Rightarrow v$ instead of $(u, v) \in \Rightarrow$.


Let $X$ and $Y$ be sets of words. Write $XY = \{xy \mid x \in X, y \in Y\}$ where $xy$ is the
concatenation of $x$ and $y$. Define $X^0 = \{\Lambda\}$ where $\Lambda$ is the null word. For
each $i \ge 0$, define $X^{i+1} = X^i X$ and $X^* = \bigcup_{i \ge 0} X^i$. Let $X^+ = X^* X$ and let
$\emptyset$ denote the empty set.
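The word-set operations above can be tried out on small finite sets. The following is a minimal sketch, assuming words are Python strings and the null word is the empty string; the names `concat`, `power` and `star_up_to` are ours, and since $X^*$ is infinite in general the last function only builds the union up to a chosen bound.

```python
from itertools import product
from typing import Set

LAMBDA = ""  # assumption: the null word is modelled as the empty string


def concat(X: Set[str], Y: Set[str]) -> Set[str]:
    """XY = {xy | x in X, y in Y}."""
    return {x + y for x, y in product(X, Y)}


def power(X: Set[str], i: int) -> Set[str]:
    """X^0 = {LAMBDA} and X^(i+1) = X^i X."""
    result = {LAMBDA}
    for _ in range(i):
        result = concat(result, X)
    return result


def star_up_to(X: Set[str], k: int) -> Set[str]:
    """Finite approximation of X*: the union of X^0, X^1, ..., X^k."""
    out: Set[str] = set()
    for i in range(k + 1):
        out |= power(X, i)
    return out
```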



Definition. Let $u, v \in V^*$. Define $u \Rightarrow v$ if there exist words $x, y, w \in V^*$
and $A \in N$ so that $u = xAy$, $v = xwy$, and $A \to w$ is in $P$. If $y \in \Sigma^*$, we write
$u \stackrel{R}{\Rightarrow} v$. Furthermore, we will write the reflexive-transitive closure of $\Rightarrow$
as $\stackrel{*}{\Rightarrow}$. If we wish to make clear that the grammar $G$ is being used, we will
write $\Rightarrow_G$.

Definition. Let $x_i \in V^*$ ($0 \le i \le r$). If $x_i \Rightarrow x_{i+1}$ by applying the production
$\pi_i$, then we say that $x_i$ directly derives $x_{i+1}$ via $\pi_i$. If $x_0 \Rightarrow x_1 \Rightarrow \dots \Rightarrow x_r$
where $x_i$ directly derives $x_{i+1}$ via $\pi_i$ (for all $0 \le i < r$) then we say that
$x_0$ derives $x_r$ via $(\pi_i)_{i=0}^{r-1}$ and that $(\pi_i)_{i=0}^{r-1}$ is a derivation of $x_r$ from $x_0$.
If $x_i \stackrel{R}{\Rightarrow} x_{i+1}$ (for all $0 \le i < r$) then $(\pi_i)_{i=0}^{r-1}$ is called a canonical
derivation of $x_r$ from $x_0$.
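To make the notion of a canonical derivation concrete, the sketch below replays a sequence of productions, always rewriting the rightmost non-terminal. It is an illustration only: the encoding of a production as a pair `(A, w)`, the function names, and the three-rule grammar at the end are our assumptions, not part of the paper.

```python
from typing import List, Sequence, Set, Tuple

# (A, w) stands for the production A -> w; w == () encodes a Lambda-rule.
Production = Tuple[str, Tuple[str, ...]]


def rightmost_step(form: Sequence[str], prod: Production,
                   nonterminals: Set[str]) -> List[str]:
    """Rewrite the rightmost non-terminal of `form` with `prod` (one canonical step)."""
    lhs, rhs = prod
    for pos in range(len(form) - 1, -1, -1):
        if form[pos] in nonterminals:
            if form[pos] != lhs:
                raise ValueError(f"rightmost non-terminal is {form[pos]!r}, not {lhs!r}")
            return list(form[:pos]) + list(rhs) + list(form[pos + 1:])
    raise ValueError("no non-terminal left to rewrite")


def canonical_forms(start: str, derivation: Sequence[Production],
                    nonterminals: Set[str]) -> List[List[str]]:
    """Return x_0, x_1, ..., x_r for the derivation (pi_0, ..., pi_{r-1})."""
    forms = [[start]]
    for prod in derivation:
        forms.append(rightmost_step(forms[-1], prod, nonterminals))
    return forms


# Hypothetical three-rule grammar: S -> AB, A -> a, B -> b.
N = {"S", "A", "B"}
forms = canonical_forms("S", [("S", ("A", "B")), ("B", ("b",)), ("A", ("a",))], N)
```

Here `forms` lists the sentential forms S, AB, Ab, ab in order; applying A -> a before B -> b would be rejected, because A is not the rightmost non-terminal of AB.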

Definition. We define $L(G;X) = \{x \in \Sigma^* \mid X \stackrel{*}{\Rightarrow}_G x\}$ for all $X \in V$. $L(G;X)$ is called
the language generated by $G$ from $X$. If $X = S$, the goal symbol of $G$, then $L(G;S)$
is called simply the language generated by $G$, and is denoted by $L(G)$.

Definition. A CF grammar $G$ is said to be unambiguous if and only if
for all $x \in L(G;S)$ and for all canonical derivations $(\pi_i)_{i=1}^{n}$, $(\pi'_i)_{i=1}^{n'}$ which
derive $x$ from $S$, $n = n'$ and $\pi_i = \pi'_i$ (for all $1 \le i \le n$). A CF grammar which
is not unambiguous is said to be ambiguous.

Definition. A context-free grammar $G = (V, \Sigma, P, S)$ is said to be

(a) $\Lambda$-free if $P \subseteq N \times V^+$;

(b) reduced if

(i) for each $A \in V$, there exist $x, y \in V^*$ so that $S \stackrel{*}{\Rightarrow} xAy$, and

(ii) for each $A \in N$ there exists $x \in \Sigma^*$ so that $A \stackrel{*}{\Rightarrow} x$;

(c) in operator form if $P \subseteq N \times (V^* - V^* N^2 V^*)$;

(d) in canonical two form if $P \subseteq N \times (\{\Lambda\} \cup V \cup N^2)$.
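Three of these properties are purely syntactic conditions on $P$ and can be checked mechanically. The following sketch assumes a production is encoded as a pair `(A, w)` with `w` a tuple of symbols (the empty tuple for $\Lambda$); the function names are ours. The reducedness condition needs a reachability and productivity analysis and is not covered here.

```python
from typing import List, Set, Tuple

# (A, w) stands for the production A -> w; w == () encodes a Lambda-rule.
Production = Tuple[str, Tuple[str, ...]]


def is_lambda_free(P: List[Production]) -> bool:
    """P is a subset of N x V+: no production has an empty right part."""
    return all(len(rhs) >= 1 for _, rhs in P)


def is_operator_form(P: List[Production], N: Set[str]) -> bool:
    """No right part contains two adjacent non-terminals."""
    return all(
        not (a in N and b in N)
        for _, rhs in P
        for a, b in zip(rhs, rhs[1:])
    )


def is_canonical_two_form(P: List[Production], N: Set[str]) -> bool:
    """Every right part is Lambda, a single symbol, or two non-terminals."""
    return all(
        len(rhs) <= 1 or (len(rhs) == 2 and rhs[0] in N and rhs[1] in N)
        for _, rhs in P
    )
```

For instance, a grammar containing S -> AB with A and B non-terminals fails the operator-form test, while a grammar with rules such as E -> E + T passes it but is not in canonical two form.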



Following Gray and Harrison [3], we define the notion of a cover.

Definition. Let $G = (V, \Sigma, P, S)$ be a grammar and let $H \subseteq P$. Let

$D = (A_i \to x_i)_{i=1}^{n}$

be a canonical derivation in $G$. Then the corresponding $H$-sparse derivation
is

$D_H = (A_i \to x_i \mid A_i \to x_i \text{ is in } H)_{i=1}^{n}$.

Definition. Let $G = (V, \Sigma, P, S)$ and $G' = (V', \Sigma, P', S')$ be context-free
grammars. Let $H \subseteq P$ and $H' \subseteq P'$. Let $\varphi$ be a map from $H'$ into $H$. For any
canonical derivation $D = (A_i \to x_i)_{i=1}^{n}$ in $G'$ of some $x \in \Sigma^*$, define the
image of $D$ under $\varphi$ to be $\varphi(D) = (\varphi(A_i \to x_i) \mid A_i \to x_i \text{ is in } H')_{i=1}^{n}$. $\varphi(D)$
is an element of $H^*$. $(G', H')$ is said to cover $(G, H)$ under $\varphi$ iff

(a) $L(G) = L(G')$, and

(b) for each $x \in L(G)$,

(i) if $D$ is an $H$-sparse derivation of $x$ in $G$ then there is an
$H'$-sparse derivation $D'$ of $x$ in $G'$ so that $\varphi D' = D$, and

(ii) if $D'$ is an $H'$-sparse derivation of $x$ in $G'$ then $\varphi D'$ is
an $H$-sparse derivation of $x$ in $G$.

$G'$ is said to cover $(G, H)$ if some $H'$ and $\varphi$ exist such that $(G', H')$
covers $(G, H)$ under $\varphi$. If $G'$ covers $(G, P)$ we say $G'$ completely covers $G$.
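Both the $H$-sparse derivation and the image under $\varphi$ are simple filters over a sequence of productions. The sketch below shows them on a toy pair of grammars that we invent for illustration: $G$ has the single rule S -> a, and $G'$ adds a new goal symbol Z with Z -> S. Representing $\varphi$ as a Python dictionary is our choice.

```python
from typing import Dict, List, Sequence, Set, Tuple

Production = Tuple[str, Tuple[str, ...]]  # (A, w) stands for A -> w


def sparse(D: Sequence[Production], H: Set[Production]) -> List[Production]:
    """H-sparse derivation: the steps of D that lie in H, in their original order."""
    return [p for p in D if p in H]


def image(D_prime: Sequence[Production], H_prime: Set[Production],
          phi: Dict[Production, Production]) -> List[Production]:
    """phi(D'): apply phi to each step of D' that lies in H'; other steps vanish."""
    return [phi[p] for p in D_prime if p in H_prime]


# Toy instance (assumed for illustration only).
D_prime = [("Z", ("S",)), ("S", ("a",))]   # canonical derivation of "a" in G'
H_prime = {("S", ("a",))}                  # Z -> S is left out of H'
phi = {("S", ("a",)): ("S", ("a",))}       # identity on H'
D = [("S", ("a",))]                        # canonical derivation of "a" in G
```

In this instance `image(D_prime, H_prime, phi)` equals `sparse(D, set(D))`, which is what conditions (b)(i) and (b)(ii) ask for on the single sentence "a".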
2.  Main  Results 

We turn immediately to the development of our claims.

Claim 1. Every CF grammar $G = (V, \Sigma, P, S)$ is completely covered by a
grammar $G' = (V', \Sigma, P', S')$ for which:

a) $S' \to S \in P'$;

b) $S' \to S_\Lambda \in P'$;

c) $S'$ does not appear in the right part of any production in $P'$;

d) $L(G';S) = L(G;S) - \{\Lambda\}$, and;

e) $L(G';S_\Lambda) = L(G;S) - L(G';S)$.



(Thus, $L(G';S)$ consists of all non-null sentences of $L(G;S)$. Furthermore,
$L(G';S_\Lambda) = \{\Lambda\}$ if $\Lambda \in L(G;S)$ and $L(G';S_\Lambda) = \emptyset$ otherwise.)

Proof. We may assume [3], without loss of generality, that $G$ is in canonical
two form. Define $V' = V \cup \{S'\} \cup \{A_\Lambda \mid A \in N\}$ and $P' = P_1 \cup P_2 \cup P_3 \cup P_4 \cup P_5$
where

$P_1 = \{S' \to S,\ S' \to S_\Lambda\}$

$P_2 = \{A \to B,\ A_\Lambda \to B_\Lambda \mid A, B \in N;\ A \to B \in P\}$

$P_3 = \{A \to BC,\ A \to B_\Lambda C,\ A \to B C_\Lambda,\ A_\Lambda \to B_\Lambda C_\Lambda \mid A, B, C \in N;\ A \to BC \in P\}$

$P_4 = \{A_\Lambda \to \Lambda \mid A \in N,\ A \to \Lambda \in P\}$

$P_5 = \{A \to a \mid A \in N,\ a \in \Sigma,\ A \to a \in P\}$.

Properties a) - e) hold by construction. It is also clear that
$(G', P_2 \cup P_3 \cup P_4 \cup P_5)$ covers $(G, P)$.
Hence $G'$ completely covers $G$. Moreover, the canonical two form is preserved. □
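The five production sets of this proof translate directly into a loop over $P$. The following is a minimal sketch, assuming productions are encoded as pairs `(A, w)` and that the subscripted symbol $A_\Lambda$ is spelled `A_L`; both the naming scheme and the function `lambda_split` are hypothetical. The sketch performs no reduction, so its output may contain useless productions that a reduced grammar would omit.

```python
from typing import List, Set, Tuple

# (A, w) stands for the production A -> w; w == () encodes a Lambda-rule.
Production = Tuple[str, Tuple[str, ...]]


def lam(A: str) -> str:
    """Hypothetical spelling of the new non-terminal A_Lambda."""
    return A + "_L"


def lambda_split(P: List[Production], N: Set[str], S: str,
                 new_goal: str = "S'") -> List[Production]:
    """Build P1 u P2 u P3 u P4 u P5 from a grammar in canonical two form."""
    out: List[Production] = [(new_goal, (S,)), (new_goal, (lam(S),))]   # P1
    for A, rhs in P:
        if len(rhs) == 0:                                               # P4
            out.append((lam(A), ()))
        elif len(rhs) == 1 and rhs[0] in N:                             # P2
            (B,) = rhs
            out += [(A, (B,)), (lam(A), (lam(B),))]
        elif len(rhs) == 1:                                             # P5
            out.append((A, rhs))
        elif len(rhs) == 2 and rhs[0] in N and rhs[1] in N:             # P3
            B, C = rhs
            out += [(A, (B, C)), (A, (lam(B), C)), (A, (B, lam(C))),
                    (lam(A), (lam(B), lam(C)))]
        else:
            raise ValueError(f"{A} -> {rhs} is not in canonical two form")
    return out


# A small grammar in canonical two form, with Lambda-rules for A, C and D.
example_N = {"S", "A", "B", "C", "D"}
example_P = [("S", ("A", "B")), ("A", ("a",)), ("A", ()), ("B", ("b",)),
             ("B", ("C", "D")), ("C", ()), ("D", ())]
split = lambda_split(example_P, example_N, "S")
```

Note that every $\Lambda$-rule of the input ends up on a subscripted non-terminal, so the unsubscripted symbols of `split` have no $\Lambda$-rules of their own.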

A grammar $G' = (V', \Sigma, P', S')$ which satisfies a) - c) of Claim 1
and for which $\Lambda \notin L(G';S)$ and $L(G';S_\Lambda) \subseteq \{\Lambda\}$, is called a $\Lambda$-split grammar.

Having obtained such a $\Lambda$-split grammar, we wish to obtain a $\Lambda$-free
cover for the portion which generates $L(G';S)$. The following claim
establishes that ability.

Claim 2. Every unambiguous CF grammar $G' = (V', \Sigma, P', S)$ for which $\Lambda \notin L(G';S)$
is completely covered by a $\Lambda$-free grammar.

Proof. We may assume without loss of generality that $G'$ is in reduced
canonical two form and is $\Lambda$-split via the transformation of Claim 1. Since $G'$ is
unambiguous, it follows that for each $B_\Lambda \in N'$, the length of the canonical
derivation $B_\Lambda \stackrel{*R}{\Rightarrow}_{G'} \Lambda$ is finite. Let $\mu(B_\Lambda)$ denote that length, and let

$M = \{1, 2, \dots, \max_{B_\Lambda \in N'} (\mu(B_\Lambda))\}$.

Let $G'' = (V'', \Sigma, P'', S)$ where $V'' = V' \cup (N' \times N' \times M)$.
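The lengths $\mu(B_\Lambda)$ can be computed by a short recursion: one step for the production applied to $B_\Lambda$, plus the steps needed to erase each symbol of its right part. The sketch below is ours and rests on a stated assumption: in a reduced unambiguous grammar each $\Lambda$-generating non-terminal has exactly one production and none derives itself, since otherwise $\Lambda$ would have more than one derivation. The function rejects inputs that violate this. The `_L` suffix for subscripted symbols is a hypothetical naming scheme.

```python
from typing import Dict, List, Tuple

# (A, w) stands for the production A -> w; w == () encodes a Lambda-rule.
Production = Tuple[str, Tuple[str, ...]]


def mu_table(P_lambda: List[Production]) -> Dict[str, int]:
    """Map each non-terminal of the Lambda-generating portion to the length
    of its derivation of Lambda."""
    rule: Dict[str, Tuple[str, ...]] = {}
    for A, rhs in P_lambda:
        if A in rule:
            raise ValueError(f"{A} has two productions, so Lambda has two derivations")
        rule[A] = rhs

    memo: Dict[str, int] = {}

    def mu(A: str, active: Tuple[str, ...]) -> int:
        if A in memo:
            return memo[A]
        if A in active:
            raise ValueError(f"{A} derives itself, so its derivation is not finite")
        # One step for A's own production, plus the steps that erase each symbol.
        memo[A] = 1 + sum(mu(B, active + (A,)) for B in rule[A])
        return memo[A]

    return {A: mu(A, ()) for A in rule}


# Lambda-generating portion of a small Lambda-split grammar.
P_lam = [("S_L", ("A_L", "B_L")), ("A_L", ()), ("B_L", ("C_L", "D_L")),
         ("C_L", ()), ("D_L", ())]
table = mu_table(P_lam)
```

On this input the recursion assigns length 1 to the three symbols with $\Lambda$-rules, 3 to `B_L`, and 5 to `S_L`.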




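Define $P'' = P'_1 \cup P'_2 \cup P'_3$ where

$P'_1 = \{A \to v \mid A \in N',\ v \in V'^+,\ A \to v \in P'\}$

$P'_2 = \{A \to (B_\Lambda, C, 1),\ (B_\Lambda, C, 1) \to (B_\Lambda, C, 2),\ \dots,\ (B_\Lambda, C, \mu(B_\Lambda)-1) \to (B_\Lambda, C, \mu(B_\Lambda)),$
$(B_\Lambda, C, \mu(B_\Lambda)) \to C \mid A, B_\Lambda, C \in N';\ A \to B_\Lambda C \in P'\}$

$P'_3 = \{A \to (B, C_\Lambda, 1),\ (B, C_\Lambda, 1) \to (B, C_\Lambda, 2),\ \dots,\ (B, C_\Lambda, \mu(C_\Lambda)-1) \to (B, C_\Lambda, \mu(C_\Lambda)),$
$(B, C_\Lambda, \mu(C_\Lambda)) \to B \mid A, B, C_\Lambda \in N';\ A \to B C_\Lambda \in P'\}$.

We define $\varphi$ by cases:

a) $\varphi(A \to v \in P'_1) = A \to v$;

b) $\varphi(A \to (B_\Lambda, C, 1) \in P'_2) = A \to B_\Lambda C$;

c) $\varphi((B_\Lambda, C, i) \to (B_\Lambda, C, i+1) \in P'_2) = \pi'_i$ for $1 \le i < \mu(B_\Lambda)$;

d) $\varphi((B_\Lambda, C, \mu(B_\Lambda)) \to C \in P'_2) = \pi'_{\mu(B_\Lambda)}$, where $B_\Lambda \stackrel{*R}{\Rightarrow}_{G'} \Lambda$ via $(\pi'_i)_{i=1}^{\mu(B_\Lambda)}$;

e) $\varphi(A \to (B, C_\Lambda, 1) \in P'_3) = A \to B C_\Lambda$;

f) $\varphi((B, C_\Lambda, i) \to (B, C_\Lambda, i+1) \in P'_3) = \pi'_i$ for $1 \le i < \mu(C_\Lambda)$, and;

g) $\varphi((B, C_\Lambda, \mu(C_\Lambda)) \to B \in P'_3) = \pi'_{\mu(C_\Lambda)}$ where $C_\Lambda \stackrel{*R}{\Rightarrow}_{G'} \Lambda$ via $(\pi'_i)_{i=1}^{\mu(C_\Lambda)}$.

It is easily seen that $G''$ completely covers $G'$ under $\varphi$. Furthermore, $G''$ is,
by construction, $\Lambda$-free and in canonical two form.

□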
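The chain productions of $P'_2$ and $P'_3$ can be generated as follows. This is a sketch under stated assumptions rather than a transcription of the proof: the values of $\mu$ are passed in as a dictionary, the subscripted symbols are spelled with a hypothetical `_L` suffix, and a mixed production such as $A \to B_\Lambda C$ is replaced by its chain instead of also being kept, which is how the worked example later in this section reads. The $\Lambda$-generating productions themselves are skipped.

```python
from typing import Dict, List, Set, Tuple, Union

Symbol = Union[str, Tuple[str, str, int]]        # plain symbol or chain state
Production = Tuple[Symbol, Tuple[Symbol, ...]]   # (A, w) stands for A -> w


def chain(A: str, first: str, second: str, length: int,
          survivor: str) -> List[Production]:
    """A -> (first,second,1) -> ... -> (first,second,length) -> survivor."""
    states = [(first, second, i) for i in range(1, length + 1)]
    prods: List[Production] = [(A, (states[0],))]
    prods += [(s, (t,)) for s, t in zip(states, states[1:])]
    prods.append((states[-1], (survivor,)))
    return prods


def lambda_free_part(P: List[Production], lam_nts: Set[str],
                     mu: Dict[str, int]) -> List[Production]:
    """Chains for mixed productions, plus the productions free of Lambda symbols."""
    out: List[Production] = []
    for A, rhs in P:
        if A in lam_nts or len(rhs) == 0:
            continue                               # Lambda-generating portion
        if len(rhs) == 2 and rhs[0] in lam_nts and rhs[1] not in lam_nts:
            B, C = rhs
            out += chain(A, B, C, mu[B], C)        # as in P'_2
        elif len(rhs) == 2 and rhs[1] in lam_nts and rhs[0] not in lam_nts:
            B, C = rhs
            out += chain(A, B, C, mu[C], B)        # as in P'_3
        elif not any(s in lam_nts for s in rhs):
            out.append((A, rhs))                   # untouched productions
        # anything else (e.g. S' -> S_L) belongs to the Lambda side and is skipped
    return out
```

Each chain has one state per step of the erased derivation, so the images of its links under a map like $\varphi$ can account for every production used to derive $\Lambda$.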

Theorem 2.1. Every unambiguous CF grammar $G = (V, \Sigma, P, S)$ is completely
covered by a $\Lambda$-split grammar $G' = (V', \Sigma, P', S')$ for which

a) $S' \to S \in P'$;

b) $S' \to S_\Lambda \in P'$;

c) $L(G';S) = L(G;S) - \{\Lambda\}$, and;

d) the reduced grammar obtainable from $(V', \Sigma, P', S)$ is $\Lambda$-free.

Proof. The proof follows directly from Claims 1 and 2.

□




A $\Lambda$-split grammar $G'$ which satisfies a) - d) of Theorem 2.1
is called a $\Lambda$-isolated grammar. If $\Lambda \notin L(G';S')$, then $G'$ is clearly
$\Lambda$-free as well. As an example of the application of these transformations,
we see that the grammar with productions

$S \to AB$

$A \to a \mid \Lambda$

$B \to b \mid CD$

$C \to \Lambda$

$D \to \Lambda$

is completely covered by the $\Lambda$-split grammar with productions

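$S' \to S \mid S_\Lambda$

$S \to AB \mid A_\Lambda B \mid A B_\Lambda$

$A \to a$

$B \to b$

$S_\Lambda \to A_\Lambda B_\Lambda$   ($\mu(S_\Lambda) = 5$)

$A_\Lambda \to \Lambda$   ($\mu(A_\Lambda) = 1$)

$B_\Lambda \to C_\Lambda D_\Lambda$   ($\mu(B_\Lambda) = 3$)

$C_\Lambda \to \Lambda$   ($\mu(C_\Lambda) = 1$)

$D_\Lambda \to \Lambda$   ($\mu(D_\Lambda) = 1$)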

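which is completely covered by the $\Lambda$-isolated grammar with productions

$S' \to S \mid S_\Lambda$

$S \to AB \mid (A_\Lambda, B, 1) \mid (A, B_\Lambda, 1)$

$A \to a$

$B \to b$

$(A_\Lambda, B, 1) \to B$

$(A, B_\Lambda, 1) \to (A, B_\Lambda, 2)$

$(A, B_\Lambda, 2) \to (A, B_\Lambda, 3)$

$(A, B_\Lambda, 3) \to A$

$S_\Lambda \to A_\Lambda B_\Lambda$

$A_\Lambda \to \Lambda$

$B_\Lambda \to C_\Lambda D_\Lambda$

$C_\Lambda \to \Lambda$

$D_\Lambda \to \Lambda$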

Gray and Harrison have shown [3] that the $\Lambda$-free portion of a
$\Lambda$-isolated grammar can be completely covered by an operator grammar.
We will now show that, in fact, the entire grammar can be so covered.

Claim 3. Every unambiguous grammar $G = (V, \emptyset, P, S_\Lambda)$ for which $L(G;S_\Lambda) = \{\Lambda\}$
is completely covered by an operator grammar.

Proof. We may assume without loss of generality that $G$ is in reduced
canonical two form. Since $G$ is unambiguous, it follows that the length
of the canonical derivation $S_\Lambda \stackrel{*R}{\Rightarrow}_G \Lambda$ is finite. Denote that length by $\mu(S_\Lambda)$,
and let $M = \{1, 2, \dots, \mu(S_\Lambda)\}$. Let $G' = (V', \emptyset, P', S_\Lambda)$ where
$V' = \{S_\Lambda\} \cup (N \times M)$. Define $P'$ by

$P' = \{S_\Lambda \to (S_\Lambda, 1),\ (S_\Lambda, 1) \to (S_\Lambda, 2),\ \dots,\ (S_\Lambda, \mu(S_\Lambda)-1) \to (S_\Lambda, \mu(S_\Lambda)),$
$(S_\Lambda, \mu(S_\Lambda)) \to \Lambda\}$.

Clearly, the derivation in $G'$ is isomorphic to the derivation in $G$. Thus $G'$
completely covers $G$. Furthermore, $G'$ is, by construction, an operator grammar.

□
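The grammar built in this proof is a single chain, which makes the operator-form property easy to see: no right part has more than one symbol, so two non-terminals are never adjacent. A minimal sketch follows, assuming $\mu(S_\Lambda) \ge 1$ is supplied by the caller and using the hypothetical spelling `S_L` for $S_\Lambda$; the function name is ours.

```python
from typing import List, Tuple, Union

Symbol = Union[str, Tuple[str, int]]             # goal symbol or chain state
Production = Tuple[Symbol, Tuple[Symbol, ...]]   # (A, w) stands for A -> w


def lambda_chain(S_lam: str, mu: int) -> List[Production]:
    """S_lam -> (S_lam,1) -> ... -> (S_lam,mu) -> Lambda, for mu >= 1."""
    if mu < 1:
        raise ValueError("a derivation of Lambda needs at least one step")
    states = [(S_lam, i) for i in range(1, mu + 1)]
    prods: List[Production] = [(S_lam, (states[0],))]
    prods += [(s, (t,)) for s, t in zip(states, states[1:])]
    prods.append((states[-1], ()))               # the only Lambda-rule
    return prods


chain5 = lambda_chain("S_L", 5)
```

With a derivation length of 5 the chain has five states and six productions, the last of which is the only $\Lambda$-rule.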



By a construction similar to that used in Claim 2, it is possible
to prove the following:

Theorem 2.2. Every operator grammar is completely covered by a $\Lambda$-isolated
operator grammar.

Proof. Omitted. □

We may now state our main result.

Theorem 2.3. Every unambiguous CF grammar is completely covered by a
$\Lambda$-isolated operator grammar.

Proof. The proof follows immediately from Theorem 2.1, Gray and Harrison's
Theorem 1.2 [3], Claim 3, and Theorem 2.2.

□


3.   Summary  and  Conclusions 

We have shown that it is often possible to remove $\Lambda$-rules from a
grammar without significantly altering its parse trees, and hence its
attached semantics. This may be taken as supporting evidence of the commonly
held notion that in practical situations (i.e. in the case of translators
for unambiguous grammars which do not generate the empty string), one can,
without difficulty, dispense with $\Lambda$-rules.

More surprisingly, we have been able to strengthen the operator
grammar results of Gray and Harrison [3]. There is, of course, no
contradiction between their Theorem 1.3 and our Theorem 2.3. Their Theorem 1.3
holds that there is a grammar (namely the one with productions $S \to SS \mid \Lambda$)
which cannot be covered by any operator grammar. Our results show that
their Theorem 1.3 is rooted, not in the fact that the grammar contains a
$\Lambda$-rule, but rather in the ambiguity of the grammar.




REFERENCES 


1. Floyd, R.W. Syntactic Analysis and Operator Precedence. J. ACM 10, 3
(July, 1963), 316-333.

2. Ginsburg, S. The Mathematical Theory of Context Free Languages.
McGraw-Hill, New York (1966).

3. Gray, J.N., and Harrison, M.A. On the Covering and Reduction
Problems for Context-Free Grammars. J. ACM 19, 4 (October, 1972),
675-698.


BIBLIOGRAPHIC DATA SHEET

Report No.: UIUCDCS-R-74-624
Title: ON THE COVERING PROBLEM FOR UNAMBIGUOUS CONTEXT-FREE GRAMMARS
Report Date: February, 1974
Author: M. Dennis Mickunas
Performing Organization: Department of Computer Science, University of Illinois, Urbana, Illinois 61801
Sponsoring Organization: Department of Computer Science, University of Illinois, Urbana, Illinois 61801
Type of Report: Research




Key Words: covers, parsing, ambiguity, $\Lambda$-free, operator grammars, context-free grammars


Availability Statement: RELEASE UNLIMITED
Security Class (This Report): UNCLASSIFIED
Security Class (This Page): UNCLASSIFIED
No. of Pages: 12

